NSF PAR Search | NSF Public Access Repository

Curriculum Design for Machine Learners in Sequential Decision Tasks

https://doi.org/10.1109/TETCI.2018.2829980

Peng, Bei; MacGlashan, James; Loftin, Robert; Littman, Michael L.; Roberts, David L.; Taylor, Matthew E. (August 2018, IEEE Transactions on Emerging Topics in Computational Intelligence)

Curriculum Design for Machine Learners in Sequential Decision Tasks

Peng, Bei; MacGlashan, James; Loftin, Robert; Littman, Michael L.; Roberts, David L.; Taylor, Matthew E. (April 2017, AAMAS)

Existing machine-learning work has shown that algorithms can benefit from curricula: learning first on simple examples before moving to more difficult examples. While most existing work on curriculum learning focuses on developing automatic methods to iteratively select training examples with increasing difficulty tailored to the current ability of the learner, relatively little attention has been paid to the ways in which humans design curricula. We argue that a better understanding of the human-designed curricula could give us insights into the development of new machine-learning algorithms and interfaces that can better accommodate machine- or human-created curricula. Our work addresses this emerging and vital area empirically, taking an important step to characterize the nature of human-designed curricula relative to the space of possible curricula and the performance benefits that may (or may not) occur.
Interactive Learning from Policy-Dependent Human Feedback

MacGlashan, James; K Ho, Mark; Loftin, Robert; Peng, Bei; Wang, Guan; Roberts, David L.; Taylor, Matthew E.; Littman, Michael L. (July 2017, ICML)

This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback. Much previous work on this problem has made the assumption that people provide feedback for decisions that is dependent on the behavior they are teaching and is independent from the learner’s current policy. We present empirical results that show this assumption to be false—whether human trainers give a positive or negative feedback for a decision is influenced by the learner’s current policy. Based on this insight, we introduce Convergent Actor-Critic by Humans (COACH), an algorithm for learning from policy-dependent feedback that converges to a local optimum. Finally, we demonstrate that COACH can successfully learn multiple behaviors on a physical robot.
A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans

Peng, Bei; MacGlashan, James; Loftin, Robert; Littman, Michael L.; Roberts, David L.; Taylor, Matthew E. (July 2016, Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems)

As robots become pervasive in human environments, it is important to enable users to effectively convey new skills without programming. Most existing work on Interactive Reinforcement Learning focuses on interpreting and incorporating non-expert human feedback to speed up learning; we aim to design a better representation of the learning agent that is able to elicit more natural and effective communication between the human trainer and the learner, while treating human feedback as discrete communication that depends probabilistically on the trainer's target policy. This work entails a user study where participants train a virtual agent to accomplish tasks by giving reward and/or punishment in a variety of simulated environments. We present results from 60 participants to show how a learner can ground natural language commands and adapt its action execution speed to learn more efficiently from human trainers. The agent's action execution speed can be successfully modulated to encourage more explicit feedback from a human trainer in areas of the state space where there is high uncertainty. Our results show that our novel adaptive speed agent dominates different fixed speed agents on several measures of performance. Additionally, we investigate the impact of instructions on user performance and user preference in training conditions.
A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans

Peng, Bei; MacGlashan, James; Loftin, Robert; Littman, Michael L.; Roberts, David L.; Taylor, Matthew E. (January 2016, AAMAS)

As robots become pervasive in human environments, it is important to enable users to effectively convey new skills without programming. Most existing work on Interactive Reinforcement Learning focuses on interpreting and incorporating non-expert human feedback to speed up learning; we aim to design a better representation of the learning agent that is able to elicit more natural and effective communication between the human trainer and the learner, while treating human feedback as discrete communication that depends probabilistically on the trainer’s target policy. This work entails a user study where participants train a virtual agent to accomplish tasks by giving reward and/or punishment in a variety of simulated environments. We present results from 60 participants to show how a learner can ground natural language commands and adapt its action execution speed to learn more efficiently from human trainers. The agent’s action execution speed can be successfully modulated to encourage more explicit feedback from a human trainer in areas of the state space where there is high uncertainty. Our results show that our novel adaptive speed agent dominates different fixed speed agents on several measures of performance. Additionally, we investigate the impact of instructions on user performance and user preference in training conditions.
Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning

https://doi.org/10.1007/s10458-015-9283-7

Loftin, Robert; Peng, Bei; MacGlashan, James; Littman, Michael L.; Taylor, Matthew E.; Huang, Jeff; Roberts, David L. (January 2016, Autonomous Agents and Multi-Agent Systems)

